AnchorNet: A Weakly Supervised Network to Learn Geometry-sensitive Features For Semantic Matching (Supplementary Material)
Authors
Abstract
In this section we provide additional details about the learning protocol of AnchorNet. Training converges after visiting 4×10 training samples per class in stage 1 and 1.2×10 samples in stage 2 (two days on a single NVIDIA Tesla M40 GPU). The learning rate was fixed to 10, the minibatch size to 16, and the momentum to the standard value of 0.0005. The training data were augmented as in [2].

The losses were balanced as follows. The weights of L_Discr and L^Aux_Discr were set to 1 and 10, respectively. The weight of L_R was set to the higher value of 10^6, which is necessary because the l2 normalization applied just before computing L_R inhibits its gradient. The weights of L^{A,B}_Div and L_R were set as high as possible so that L^{A,B}_Div ≈ L_R ≈ 0 are treated approximately as hard constraints. Importantly, L_R is optimized only when visiting positive samples, since reconstructing the activations of negative samples would waste the capacity of the autoencoder.

During the first training stage, positive and negative samples are drawn with equal probability. During stage 2, we additionally ensure that the distribution of positive samples is uniform over the set of 20 Pascal categories, which makes positives from any given object category 20× less frequent than negatives. Hence, to rebalance the losses in stage 2, we decrease the weights of negative samples by a factor of 20. Finally, because the gradients from L^{A,B}_Div exhibit high magnitudes, we decrease the learning rate on the layers below the first autoencoder layer by a factor of 10 during the second stage.
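The loss-balancing scheme above can be summarized in a short sketch. This is an illustrative reconstruction, not the authors' code: the function and argument names are hypothetical, and the large weight shared by L_R and L^{A,B}_Div is written as 1e6 following the value stated for L_R (some exponents in the source text are garbled).

```python
# Hypothetical sketch of the AnchorNet loss balancing described above.
# Weights: L_Discr -> 1, L_Aux_Discr -> 10, L_R and L_Div -> ~10^6
# (the "hard constraint" losses), as read from the text.
W_DISCR = 1.0
W_AUX = 10.0
W_HARD = 1e6            # drives L_Div and L_R toward ~0
NEG_DOWNWEIGHT = 1.0 / 20.0  # stage-2 rebalancing of negative samples

def anchor_loss(l_discr, l_aux, l_r, l_div, positive, stage2):
    """Combine the per-sample losses with the weights from the text.

    L_R is applied only to positive samples (reconstructing negatives
    would waste autoencoder capacity). In stage 2, negatives are 20x
    more frequent than positives of any one class, so their loss is
    scaled down by 1/20.
    """
    loss = W_DISCR * l_discr + W_AUX * l_aux + W_HARD * l_div
    if positive:
        loss += W_HARD * l_r  # reconstruction loss on positives only
    elif stage2:
        loss *= NEG_DOWNWEIGHT  # compensate for negative oversampling
    return loss
```

For example, a stage-2 negative sample with unit discriminative losses and zero divergence loss contributes (1 + 10) / 20 = 0.55 to the objective, while any nonzero L_R or L^{A,B}_Div on a positive sample dominates the total, which is what treats them as approximate hard constraints.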
Similar resources
Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection
One of the most promising directions of deep learning is the development of self-supervised methods that can substantially reduce the quantity of manually-labeled training data required to learn a model. Several recent contributions, in particular, have proposed self-supervision techniques suitable for tasks such as image classification. In this work, we look instead at self-supervision for geo...
Weakly Supervised Semantic Segmentation Using Superpixel Pooling Network
We propose a weakly supervised semantic segmentation algorithm based on deep neural networks, which relies on imagelevel class labels only. The proposed algorithm alternates between generating segmentation annotations and learning a semantic segmentation network using the generated annotations. A key determinant of success in this framework is the capability to construct reliable initial annota...
Deep Patch Learning for Weakly Supervised Object Classification and Discovery
Patch-level image representation is very important for object classification and detection, since it is robust to spatial transformation, scale variation, and cluttered background. Many existing methods usually require fine-grained supervisions (e.g., bounding-box annotations) to learn patch features, which requires a great effort to label images may limit their potential applications. In this ...
A Weakly Supervised Approach for Semantic Image Indexing and Retrieval
This paper presents a new approach for building semantic image indexing and retrieval systems. Our approach is composed of four phases : (1) knowledge acquisition, (2) weakly-supervised learning, (3) indexing and (4) retrieval. Phase 1 is driven by a visual concept ontology which helps the expert to define low-level features useful to characterize object classes. Phase 2 uses acquired knowledge...
Publication date: 2017